Series, Weighted Automata, Probabilistic Automata and Probability Distributions for Unranked Trees

Authors

  • Édouard Gilbert
  • Marc Tommasi
  • Rémi Gilleron
Abstract

We study tree series and weighted tree automata over unranked trees. The main message is that recognizable tree series for unranked trees can be defined and studied through recognizable tree series for binary representations of unranked trees. To this end, we prove results of [1] as follows. We extend hedge automata (a class of tree automata for unranked trees) to weighted hedge automata. We define weighted stepwise automata as weighted tree automata for binary representations of unranked trees. We show that recognizable tree series can be equivalently defined by weighted hedge automata or by weighted stepwise automata. We then consider real-valued tree series and weighted tree automata over the field of real numbers. We show that the equivalence also holds for probabilistic automata, that is, weighted automata with normalisation conditions on rules. We also define convergent tree series and show that convergence properties of recognizable tree series are preserved by the binary encoding. Following [21], we present decidability results on probabilistic tree automata and algorithms for computing sums of convergent series. Finally, we show that streaming algorithms for unranked trees can be seen as slight transformations of algorithms on the binary representations.

Key-words: Tree automata, weighted automata, XML

∗ This work was supported by the ANR Lampada project ANR-09-EMER-007.

inria-00455955, version 2 - 9 Mar 2010

Summary (Résumé): We study tree series and weighted tree automata over unranked trees. The main message is that recognizable tree series for unranked trees can be defined and studied through recognizable tree series for binary representations of unranked trees. To this end, we prove results of [1] as follows. We extend hedge automata (a class of tree automata for unranked trees) to weighted hedge automata. We define weighted stepwise automata as weighted tree automata for binary representations of unranked trees. We show that recognizable tree series can be equivalently defined by weighted hedge automata or by weighted stepwise automata. We also show that the result holds for probabilistic automata, that is, weighted automata over the field of real numbers with a normalisation condition on rules. We also claim that convergence properties of recognizable tree series are preserved. Finally, we show that streaming algorithms for unranked trees can be seen as slight transformations of algorithms on the binary representations.

Keywords (Mots-clés): Tree automata, weighted automata, XML
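To make the idea of working on binary representations concrete, here is a minimal Python sketch. It assumes the curried encoding usually associated with stepwise automata, in which a node f(t1, ..., tn) becomes a left-nested chain of binary applications @(...@(f, t1)..., tn); the abstract itself does not spell out the encoding. The names `Tree`, `App`, `curry`, `state_weights`, and `series_value` are hypothetical, and the toy weights are illustrative only: they are not taken from the paper and are not claimed to satisfy its normalisation conditions.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Tree:
    """Unranked tree: a label with any number of children."""
    label: str
    children: Tuple["Tree", ...] = ()


@dataclass(frozen=True)
class App:
    """Binary application node '@' of the curried encoding."""
    left: "Bin"
    right: "Bin"


# A binary tree is either a leaf symbol (str) or an application node.
Bin = Union[str, App]


def curry(t: Tree) -> Bin:
    """Encode f(t1, ..., tn) as @(...@(f, enc(t1))..., enc(tn))."""
    node: Bin = t.label
    for child in t.children:
        node = App(node, curry(child))
    return node


def state_weights(b: Bin, states, init: Dict, step: Dict) -> Dict[str, float]:
    """Bottom-up weight of reaching each state on the binary tree b.

    init[(a, q)]       : assumed weight of rule  a -> q
    step[(q1, q2, q)]  : assumed weight of rule  @(q1, q2) -> q
    Missing entries count as weight 0.
    """
    if isinstance(b, str):
        return {q: init.get((b, q), 0.0) for q in states}
    wl = state_weights(b.left, states, init, step)
    wr = state_weights(b.right, states, init, step)
    return {
        q: sum(step.get((q1, q2, q), 0.0) * wl[q1] * wr[q2]
               for q1, q2 in product(states, repeat=2))
        for q in states
    }


def series_value(t: Tree, states, init, step, final) -> float:
    """Value assigned to the unranked tree t, computed on its encoding."""
    w = state_weights(curry(t), states, init, step)
    return sum(final.get(q, 0.0) * w[q] for q in states)


# Toy one-state automaton over the reals (illustrative weights only).
STATES = ("q",)
INIT = {("a", "q"): 0.5}
STEP = {("q", "q", "q"): 0.5}
FINAL = {"q": 1.0}

t = Tree("a", (Tree("a"), Tree("a")))
value = series_value(t, STATES, INIT, STEP, FINAL)
```

With these toy weights, every node contributes a factor 0.5 and every application another 0.5, so the three-node tree `a(a, a)` receives 0.5^5 = 0.03125. The evaluation never inspects the unranked tree directly; it only walks the binary encoding, which is the point the abstract makes about studying unranked tree series through their binary representations.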

Related resources

On Probability Distributions for Trees: Representations, Inference and Learning

We study probability distributions over free algebras of trees. Probability distributions can be seen as particular (formal power) tree series [BR82; EK03], i.e. mappings from trees to a semiring K. A widely studied class of tree series is the class of rational (or recognizable) tree series which can be defined either in an algebraic way or by means of multiplicity tree automata. We argue that ...

Equivalences between Ranked and Unranked Weighted Tree Automata via Binarization

Encoding unranked trees as binary trees, henceforth called binarization, is an important method for dealing with unranked trees. For each of three binarizations we show that weighted (ranked) tree automata together with the binarization are equivalent to weighted unranked tree automata, even in the probabilistic case. This makes it easy to adapt training methods for weighted (ranked) tree automata ...

Bisimulation Minimisation of Weighted Automata on Unranked Trees

Two examples of automata-theoretic models for the validation of XML documents against user-defined schema are the stepwise unranked tree automaton (suta) and the parallel unranked tree automaton (puta). By adding a weight, taken from some semiring, to every transition we generalise these two qualitative automata models to quantitative models, thereby obtaining weighted stepwise unranked tree aut...

The Supports of Weighted Unranked Tree Automata

Unranked trees are a common concept in computer science. For instance, (fully structured) XML documents can be seen as unranked trees. With the help of unranked tree automata, one can investigate qualitative questions on XML documents. To allow the study of quantitative aspects, Droste and Vogler (TOCS 2011) recently proposed and investigated bottom-up weighted unranked tree automata over semirin...

A Spectral Approach for Probabilistic Grammatical Inference on Trees

We focus on the estimation of a probability distribution over a set of trees. We consider here the class of distributions computed by weighted automata, a strict generalization of probabilistic tree automata. This class of distributions (called rational distributions, or rational stochastic tree languages, RSTL) has an algebraic characterization: All the residuals (conditional) of such distributi...

Publication date: 2010